Interactive iterative image annotation

ABSTRACT

A system and computer-implemented method are provided for annotation of image data. A user is enabled to iteratively annotate the image data. An iteration of said iterative annotation comprises generating labels for a current image data part based on user-verified labels of a previous image data part, and enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part. The labels for the current image data part are generated by combining respective outputs of a label propagation algorithm and a machine-learned classifier trained on user-verified labels and image data and applied to image data of the current image data part. The machine-learned classifier is retrained using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.

CROSS-REFERENCE TO PRIOR APPLICATIONS

This application is the U.S. National Phase application under 35 U.S.C. § 371 of International Application No. PCT/EP2019/081422, filed on Nov. 15, 2019, which claims the benefit of European Patent Application No. 18207896.4, filed on Nov. 22, 2018. These applications are hereby incorporated by reference herein.

FIELD OF THE INVENTION

The invention relates to a system and computer-implemented method for interactive image annotation, for example to delineate an anatomical structure in a medical image. The invention further relates to a computer-readable medium comprising instructions to perform a computer-implemented method.

The invention further relates to a workstation and imaging apparatus comprising the system, and to a computer-readable medium comprising instructions for causing a processor system to perform the computer-implemented method.

BACKGROUND OF THE INVENTION

Image annotation is widely used in various fields, including but not limited to the medical field. In the latter example, image annotation is often used to identify anatomical structures in medical images, e.g., by delineating the boundaries of the anatomical structure, by labelling the voxels enclosed by the boundaries, etc. Such image annotation is also referred to as segmentation or delineation. Besides the medical field, there are also various uses for image annotation in other fields.

It is known to perform image annotation automatically. However, fully automatic image annotation is challenging and often does not produce the required accuracy. For example, in the medical field, clinical applications such as radiotherapy planning, pre-operative planning, etc. may require a sufficiently accurate annotation to produce reliable results. In recent years, learning-based methods such as model-based segmentation or deep (machine) learning approaches have shown great promise for automatic image annotation. These methods, however, typically require large amounts of manually labeled data, which is time-consuming and laborious, and thus expensive, to obtain. In addition, pre-trained algorithms can only be provided for the most common clinical tasks, but there is a large variety of further tasks where image annotation needs to be performed efficiently and accurately.

It is also known to perform image annotation semi-automatically, e.g., in an interactive manner as described in “Interactive Whole-Heart Segmentation in Congenital Heart Disease” by Pace, D. F. et al., MICCAI 2015, pp. 80-88. Although this avoids the need for the large amounts of manually labeled data, such image annotation may be less accurate than a well-trained learning-based approach, and/or require more interaction time of the user compared to a learning-based approach which is trained on a sufficiently large amount of image data.

US 2018/061091 A1 describes multi-atlas segmentation which applies image registration to propagate anatomical labels from pre-labeled atlases to a target image and which applies label fusion to resolve conflicting anatomy labels produced by warping multiple atlases. Machine learning techniques may be used to automatically detect and correct systematic errors produced by a host automatic segmentation method.

SUMMARY OF THE INVENTION

It would be advantageous to obtain a system and method which facilitate accurate annotation while needing less manually labeled data and/or requiring less interaction time of the user.

In accordance with a first aspect of the invention, a system is provided for annotation of image data. The system may comprise:

an input interface configured to access the image data to be annotated;

a user interface subsystem comprising:

-   a user input interface configured to receive user input data from a user input device operable by a user;
-   a display output interface configured to provide display data to a display to visualize output of the system;

a processor configured to, using the user interface subsystem, establish a user interface which enables the user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises:

-   the processor generating labels for a current image data part based on user-verified labels of a previous image data part;
-   via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part;

wherein the processor is further configured to:

generate the labels for the current image data part by combining respective outputs of:

-   a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and
-   a machine-learned classifier for labeling of image data, wherein the machine-learned classifier is trained on user-verified labels and image data and applied to image data of the current image data part; and

retrain the machine-learned classifier using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.

A further aspect of the invention provides a computer-implemented method for annotation of image data. The method may comprise:

accessing the image data to be annotated;

using a user interface, enabling a user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises:

generating labels for a current image data part based on user-verified labels of a previous image data part;

via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part;

wherein the labels for the current image data part are generated by combining respective outputs of:

-   a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and
-   a machine-learned classifier for labeling of image data, wherein the machine-learned classifier is trained on user-verified labels and image data and applied to image data of the current image data part; and

retraining the machine-learned classifier using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.

A further aspect of the invention provides a computer-readable medium comprising transitory or non-transitory data representing instructions arranged to cause a processor system to perform the computer-implemented method.

The above measures provide an input interface for accessing image data to be annotated. For example, the image data may be a 2D or 3D medical image which comprises an anatomical structure which is to be segmented by annotation of image elements, such as pixels or voxels, of the image data.

An interactive and iterative annotation mechanism is established as follows. The image data is partitioned, either explicitly or implicitly, into image data parts such as image slices or image sub-volumes. These image data parts are annotated iteratively. During the iterative annotation, a previous image data part may contain user-verified labels providing an annotation of the previous image data part. A current image data part, e.g., representing an image data part following the previous image data part, is annotated as follows. Here, the term ‘following’ may refer to the current image data part being annotated in a ‘following’ iteration after the previous image data part, but also to the current image data part ‘following’ the previous image data part within the image data, e.g., by representing an adjacent image slice or, in general, there being a spatial and/or temporal (in case of spatial-temporal image data) relation between the previous and current image data part.

A label propagation algorithm is used to propagate the labels of the previous image data part to the current image data part. Such label propagation algorithms are known per se, e.g., from the publication of Pace et al. as cited in the background section, and typically use similarity in image data between the previous image data part and the current image data part to propagate the labels from the previous image data part to the current image data part. In addition, a machine-learned classifier is used which may be trained on user-verified labels and image data, for example the user-verified labels and image data of the previous image data part, or the user-verified labels of other image data, e.g., of a previous image. The machine-learned classifier is applied to the current image data part and thereby provides a labelling of the current image data part. The outputs of the label propagation algorithm and the machine-learned classifier are combined to obtain combined labels for the current image data part (also named ‘generated labels’).

Since such generated labels may be imperfect, the user is enabled to verify and correct the generated labels, e.g., using a user interface such as a Graphical User Interface (GUI), thereby obtaining the user-verified labels. Mechanisms for correcting labels are known per se in the field of image annotation.

In a following iteration of the iterative annotation, these user-verified labels may be used to generate the labels for the following image data part to be annotated. Specifically, the user-verified labels are propagated to the following image data part. Additionally, the machine-learned classifier is retrained using the user-verified labels and the image data of the current image data part. Here, the term ‘retraining’ includes both a comprehensive retraining and a partial retraining of the machine-learned classifier. For example, if the machine-learned classifier is a neural network which is retrained after each image slice or image sub-volume, only select layers or nodes of the neural network may be retrained, for example to limit the computational complexity of the retraining. In other embodiments, the retraining may be performed between the iterative annotation of different image data, for example after having completed the iterative annotation of the image data. In a specific example, all user-corrected labels of all image data parts may be used together with the respective image data to retrain the machine-learned classifier.
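
By way of illustration only, one iteration of this scheme may be sketched in Python as follows. The helper names (`propagate_labels`, `classifier.predict_proba`, `classifier.partial_fit`, `user_verify_and_correct`) are hypothetical placeholders rather than a prescribed API, and the convex weighting shown is merely one possible way of combining the two outputs.

```python
import numpy as np

def annotate_iteration(prev_part, prev_labels, curr_part,
                       propagate_labels, classifier, user_verify_and_correct,
                       alpha=0.5, threshold=0.5):
    """One illustrative iteration of the interactive, iterative annotation."""
    # Output of the label propagation algorithm: fuzzy labels in [0, 1],
    # propagated from the user-verified labels of the previous part.
    f_p = propagate_labels(prev_part, prev_labels, curr_part)
    # Output of the machine-learned classifier applied to the current part.
    f_n = classifier.predict_proba(curr_part)
    # Combine both outputs, here by a convex (weighted) combination.
    f_combined = alpha * f_n + (1.0 - alpha) * f_p
    # Threshold the combined prediction function to obtain the generated labels.
    generated = (f_combined >= threshold).astype(np.uint8)
    # The user verifies and, where needed, corrects the generated labels.
    verified = user_verify_and_correct(curr_part, generated)
    # Retrain (refine) the classifier on the newly verified labels.
    classifier.partial_fit(curr_part, verified)
    return verified
```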

Thereby, the user-verified labels are used to improve the machine-learned classifier for the following iteration or following image data. For example, if the image data represents a 2D or 3D medical image from a time-series of 2D or 3D medical images, the following image data to which the image annotation may be applied may be a following 2D or 3D medical image from said time-series.

The above measures effectively correspond to a specific combination of an interactive and iterative annotation mechanism, such as the one known from Pace et al., and a machine-learned classifier. The machine-learned classifier is periodically, e.g., between iterations of the interactive annotation or between different image data, retrained using automatically generated labels which have been verified, and if needed corrected, by a user. The user-verified labels thus represent additional training data for the machine-learned classifier which may become available over time by way of the iterative annotation and which may be used to retrain the machine-learned classifier. As such, the machine-learned classifier typically improves over time, e.g., with each iteration or with each image.

At the same time, the user does not need to provide all of the training data, but in fact may only need to verify and correct labels which were automatically generated. Namely, these labels are generated by propagating the labels of a previous image data part and by the machine-learned classifier as trained thus far. As the accuracy of the labeling by the machine-learned classifier gradually improves, the user may gradually need to provide fewer corrections. The iterative annotation may thus gradually become more automatic and require fewer user corrections.

In the above measures, the label propagation algorithm effectively deals with the machine-learned classifier having initially little or no training data, and thereby a relatively low accuracy, which may be (well) below that of the label propagation algorithm. The label propagation algorithm may thus initially ensure adequate automatic labeling and effectively serve as bootstrap for the automatic labeling, as otherwise the user would have to initially manually generate all labels. The above measures thereby facilitate accurate image annotation, in which the user's part in ensuring the accuracy by verifying and correcting labels is gradually reduced as the accuracy of the machine-learned classifier improves over time. The above measures further need less manually labeled data than a traditional machine learning approach where the user has to manually label all of the training data.

In the above and following, the labels ‘current’ and ‘previous’ are merely used to distinguish between the image data parts of different iterations of the annotation. Subsequent use of the ‘current’ image data part in retraining thus refers to that particular image data, without implying any other currentness. The phrasing “retrain the machine-learned classifier using the user-verified labels and the image data of the current image data part” is thus to be understood to refer to the part of the image data for which said user-verified labels were generated.

Optionally, the processor is configured to retrain the machine-learned classifier between iterations of the iterative annotation, or between the iterative annotation of different image data, e.g., different images or sets of images.

The processor may be configured to generate the labels for the current image data part by combining the respective outputs of the label propagation algorithm and the machine-learned classifier by weighting. For example, if the label propagation algorithm and the machine-learned classifier each provide a probability map or function, both outputs may be weighted, e.g., using a convex weighting function, to obtain a combined probability map or function which represents or provides the labels. In this respect, it is noted that a probability function may yield a probability map, e.g., indicating probabilities in a map-like format for respective image elements of the image data part to which the algorithm/classifier is applied.
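
As a non-limiting sketch, such a weighting of two probability maps could be implemented as follows, assuming both outputs are NumPy arrays of equal shape. The weight may be a scalar (a global weighting per image data part) or an array of the same shape (a local weighting per pixel or voxel); the function name is an illustrative choice.

```python
import numpy as np

def combine_probability_maps(p_propagation, p_classifier, weight):
    """Convex combination of two probability maps.

    `weight` is the relative weight of the classifier output; it may be a
    scalar (global weighting) or an array (local, per-element weighting).
    """
    weight = np.asarray(weight, dtype=float)
    return weight * p_classifier + (1.0 - weight) * p_propagation

# Example: a global weight of 0.6, as in the 60%-40% combination of FIG. 3.
p_prop = np.random.rand(64, 64)
p_net = np.random.rand(64, 64)
combined = combine_probability_maps(p_prop, p_net, 0.6)
```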

Optionally, the processor is configured to adjust the weighting during the iterative annotation of the image data, or between the iterative annotation of different image data. The weighting between the label propagation algorithm and the machine-learned classifier may be adjusted during the iterative annotation of the image data, e.g., between iterations, or between image data, e.g., between the iterative annotation of different images.

Optionally, the processor is configured to determine the weighting of the iterative annotation based on a metric quantifying an annotation accuracy of the label propagation algorithm and/or the machine-learned classifier. There exist metrics in the field of image annotation for quantifying the annotation accuracy, such as the DICE coefficient as described elsewhere in this specification. Such metrics may for example use the user-corrected labels as a ‘ground truth’, but it is also possible to generate application-specific metrics which do not rely on a ground truth to give a coarse indication of annotation accuracy. For example, there may be an expectancy that the annotation has a certain shape in view of the expected object shape. Significant deviations from this expected shape may be considered in such a metric as indicative of a lower annotation accuracy. By estimating the annotation accuracy of either or both of the label propagation algorithm and the machine-learned classifier, the weighting may be adjusted, for example by increasing a weighting of the respective output which is deemed by the metric to represent a higher annotation accuracy, relative to the other output which is deemed by the metric to represent a lower annotation accuracy. In a specific example, the weighting may be adjusted between iterations or between image data, e.g., between the iterative annotation of different images.

Optionally, the metric quantifies the annotation accuracy based on a difference between i) the output of the label propagation algorithm and/or the output of the machine-learned classifier and ii) the user-corrected labels. The user-corrected labels may be advantageously used as ‘ground truth’ for the metric. Namely, a large difference between a respective output and the user-corrected labels may indicate a lower annotation accuracy compared to a small or no difference at all.
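
For instance, when the respective output and the user-corrected labels are binary label images, the DICE coefficient (per the Dice, 1945 publication cited further below) may serve as such a metric. The sketch below is a standard formulation of that coefficient; the function name and the handling of the empty case are illustrative choices.

```python
import numpy as np

def dice_coefficient(predicted_labels, user_corrected_labels):
    """DICE coefficient between a binary prediction and user-corrected labels."""
    pred = np.asarray(predicted_labels, dtype=bool)
    ref = np.asarray(user_corrected_labels, dtype=bool)
    denominator = pred.sum() + ref.sum()
    if denominator == 0:
        return 1.0  # both empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denominator
```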

Optionally, the processor is configured to adjust the weighting by increasing a weighting of the output of the machine-learned classifier relative to the output of the label propagation algorithm. The relative weighting of the output of the machine-learned classifier may be increased based on the knowledge or assumption that the annotation accuracy of the machine-learned classifier improves over time.

Optionally, the processor is configured to start the weighting of the output of the machine-learned classifier at or substantially at zero at a start of the iterative annotation. The relative weighting of the output of the machine-learned classifier may be initially substantially zero based on the knowledge or assumption that the machine-learned classifier is undertrained and thus provides inadequate annotation accuracy. Initially, the generated labels may be primarily determined by the label propagation algorithm.

Optionally, the output of the label propagation algorithm and/or the output of the machine-learned classifier is a probability map, or one or more control points defining a contour. Although the iterative annotation is primarily described with reference to probability maps, this is not a limitation, in that other types of annotation may also be provided by the label propagation and by the machine-learned classifier. For example, the annotation may be a contour which is based on control points. A weighting of the contours provided by the label propagation algorithm and the machine-learned classifier may for example comprise weighting parameters defining the relative or absolute location and/or other aspects of the control points.

Optionally, the user interface enables a user to select or define an annotation task, and the processor selects one of a number of machine-learned classifiers based on the annotation task, for example from an internal or external database. Each machine-learned classifier may be intended to provide a task-specific labeling, for example to enable annotation of different types of anatomical structures, different clinical tasks, etc. During or after iterative annotation, the selected machine-learned classifier may be retrained based on the user-verified labels and the image data. Accordingly, over time, a number of machine-learned classifiers may be obtained which are better trained for respective annotation tasks.

It will be appreciated by those skilled in the art that two or more of the above-mentioned embodiments, implementations, and/or optional aspects of the invention may be combined in any way deemed useful.

Modifications and variations of the workstation, the imaging apparatus, the method and/or the computer program product, which correspond to the described modifications and variations of the system, can be carried out by a person skilled in the art on the basis of the present description.

A person skilled in the art will appreciate that the system and method may be applied to two-dimensional (2D), three-dimensional (3D) or four-dimensional (4D) image data acquired by various acquisition modalities such as, but not limited to, standard X-ray Imaging, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Ultrasound (US), Positron Emission Tomography (PET), Single Photon Emission Computed Tomography (SPECT), and Nuclear Medicine (NM). A dimension of the image data may relate to time.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of the invention will be apparent from and elucidated further with reference to the embodiments described by way of example in the following description and with reference to the accompanying drawings, in which

FIG. 1 shows a system for iterative annotation of image data, which comprises a user interface subsystem to enable a user to interact with the system during the iterative annotation, for example to verify and correct labels;

FIG. 2 illustrates an iteration of the iterative annotation;

FIG. 3 shows prediction functions of a label propagation algorithm (left), a neural network (middle) and a 60%-40% weighted combination of both prediction functions (right), while further showing the labels which are generated based on the respective prediction functions and the difference with respect to a ground truth labeling;

FIG. 4 illustrates a validity of the prediction functions of a label propagation algorithm and a neural network with respect to a ground truth labeling;

FIG. 5 shows a method for iterative annotation of image data; and

FIG. 6 shows a computer-readable medium comprising instructions for causing a processor system to perform the method.

It should be noted that the figures are purely diagrammatic and not drawn to scale. In the figures, elements which correspond to elements already described may have the same reference numerals.

LIST OF REFERENCE NUMBERS

The following list of reference numbers is provided for facilitating the interpretation of the drawings and shall not be construed as limiting the claims.

-   020 data storage
-   022 data communication
-   030 image data
-   032 label data
-   060 display
-   062 display data
-   080 user input device
-   082 user input data
-   100 system for interactive iterative annotation
-   120 input interface
-   122 internal data communication
-   140 processor
-   142 internal data communication
-   144 internal data communication
-   160 memory
-   180 user interface subsystem
-   182 display output interface
-   184 user input interface
-   200 label image
-   210 image slice
-   220 propagation-based prediction function
-   230 neural network-based prediction function
-   240 combined prediction function
-   250 predicted label image
-   260 difference to ground truth labeling
-   300-302 prediction functions
-   310-312 derived labels
-   320-322 difference to ground truth labeling
-   400 image slice
-   410 ground truth slice labeling
-   420 propagation-based prediction function
-   425 propagation-based validity function
-   430 neural network-based prediction function
-   435 neural network-based validity function
-   500 method for interactive iterative annotation
-   510 accessing image data to be annotated
-   520 generating labels
-   530 enabling user to verify and correct labels
-   540 retraining neural network
-   600 computer-readable medium
-   610 non-transitory data

DETAILED DESCRIPTION OF EMBODIMENTS

The following embodiments are described with reference to the medical field. However, the techniques described in this specification can also be applied in other technical fields where image annotation is desired or needed. Any references to ‘medical image’, ‘anatomical structure’, etc. are thus to be interpreted as equally applying to another type of image containing another type of object.

The machine-learned algorithm is, by way of example, a neural network. However, also other types of machine-learned classifiers may be used, including but not limited to Support Vector Machines (SVM), AdaBoost and Random Forest, see for example the publication ‘Comparison of Classifiers for Brain Tumor Segmentation’ by L. Lefkovits et al., IFMBE Proceedings, Volume 59.

FIG. 1 shows a system 100 for annotation of image data. The system 100 comprises an input interface 120 configured to access image data. In the example of FIG. 1, the input interface 120 is shown to be connected to an external data storage 020 which comprises the image data 030. The data storage 020 may, for example, be constituted by, or be part of, a Picture Archiving and Communication System (PACS) of a Hospital Information System (HIS) to which the system 100 may be connected or in which it may be comprised. Accordingly, the system 100 may obtain access to the image data 030 via external data communication 022. Alternatively, the image data 030 may be accessed from an internal data storage of the system 100 (not shown). In general, the input interface 120 may take various forms, such as a network interface to a Local Area Network (LAN) or a Wide Area Network (WAN), such as the Internet, a storage interface to an internal or external data storage, etc.

The system 100 is further shown to comprise a processor 140 configured to internally communicate with the input interface 120 via data communication 122, and a memory 160 accessible by the processor 140 via data communication 142. The processor 140 is further shown to internally communicate with a user interface subsystem 180 via data communication 144.

The system 100 is further shown to comprise a user interface subsystem 180 which may be configured to, during operation of the system 100, enable a user to interact with the system 100, for example using a graphical user interface. The user interface subsystem 180 is shown to comprise a user input interface 184 configured to receive user input data 082 from a user input device 080 operable by the user. The user input device 080 may take various forms, including but not limited to a computer mouse, touch screen, keyboard, microphone, etc. FIG. 1 shows the user input device to be a computer mouse 080. In general, the user input interface 184 may be of a type which corresponds to the type of user input device 080, i.e., a corresponding type of user device interface 184.

The user interface subsystem 180 is further shown to comprise a display output interface 182 configured to provide display data 062 to a display 060 to visualize output of the system 100. In the example of FIG. 1, the display is an external display 060. Alternatively, the display may be an internal display. It is noted that instead of a display output interface 182, the user interface subsystem 180 may also comprise another type of output interface which is configured to render output data in a sensory-perceptible manner to the user, e.g., a loudspeaker.

The processor 140 may be configured to, during operation of the system 100 and using the user interface subsystem 180, establish a user interface which enables the user to iteratively annotate the image data. Herein, an iteration of said iterative annotation comprises the processor 140 generating labels for a current image data part based on user-verified labels of a previous image data part, and via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part. The processor 140 may be further configured to generate the labels for the current image data part by combining respective outputs of a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and a neural network for labeling of image data. The neural network may be trained on user-verified labels and image data and applied by the processor 140 to image data of the current image data part. The processor 140 may be further configured to retrain the neural network using the user-verified labels and the image data of the current image data part to obtain a retrained neural network. For example, such retraining may be performed between iterations, e.g., to obtain a retrained neural network for a subsequent iteration, or between different image data.

As a result of the iterative annotation of the image data 030, label data 032 may be obtained representing the annotation of the image data 030. The label data 032 may be stored by the system 100, e.g., in the data storage 020 or elsewhere, e.g., in association with the image data 030, or displayed to the user, etc.

This operation of the system 100, and various optional aspects thereof, will be explained in more detail with reference to FIGS. 2-4.

In general, the system 100 may be embodied as, or in, a single device or apparatus, such as a workstation or imaging apparatus or mobile device. The device or apparatus may comprise one or more microprocessors which execute appropriate software. The software may have been downloaded and/or stored in a corresponding memory, e.g., a volatile memory such as RAM or a non-volatile memory such as Flash. Alternatively, the functional units of the system, e.g., the input interface, the optional user input interface, the optional display output interface and the processor, may be implemented in the device or apparatus in the form of programmable logic, e.g., as a Field-Programmable Gate Array (FPGA). In general, each functional unit of the system may be implemented in the form of a circuit. It is noted that the system 100 may also be implemented in a distributed manner, e.g., involving different devices or apparatuses. For example, the distribution may be in accordance with a client-server model, e.g., using a server and a thin-client. For example, the (re)training may be performed by one or more servers, e.g., one or more cloud-based server(s) or a high-performance computing system.

It is noted that the label propagation algorithm and the neural network may be available to the processor 140 as respective data representations, e.g., as algorithm data and neural network data. Such data representations may for example be stored in and accessed from the memory 160 and/or the data storage 020.

The following embodiments described with reference to FIGS. 2-4 assume that the image data comprises a plurality of image slices. For example, the image data may be 3D image data which is natively partitioned in image slices, or such image slices may be generated, e.g., by multiplanar reformatting techniques. However, this is not a limitation, in that the following embodiments also apply to other partitionings of the image data into respective image data parts.

Very briefly speaking, the system 100 of FIG. 1 may enable a user to interactively annotate a 2D image slice of the 3D image data, for example using labels for foreground (e.g., the anatomical structure to be annotated) and background (the surroundings of the anatomical structure), or using different labels for different anatomical structures, or using different labels for different parts of an anatomical structure, etc. Such interactive 2D annotation may be performed using annotation tools as known from literature, e.g., as described in the publication “Overview on interactive medical segmentation” by Zhao, F.; Xie, X., Annals of the BMVA 2013, No. 7, pp. 1-22, which describes for example annotation tools enabling a user to interactively define a spline contour, with the spline contour being converted by the annotation tool into a corresponding region to which a label is applied, or a paint-brush tool for interactive label editing, or other more refined interaction methods. After annotating a 2D image slice, the annotation may then be propagated to a neighboring image slice, as also described in more detail with reference to FIGS. 2-4. The neighboring image slice may then be visualized with the annotation, and the user may again use the annotation tool to correct the annotation. This process may be repeated until all image data parts have been annotated, or at least a subset of the image data parts which the user wishes to annotate.

FIG. 2 illustrates an iteration of the iterative annotation. Namely, when annotating an image slice I_(i)(x) 210, a label propagation algorithm may be used to propagate labels L_(i-1)(x) 200 to the current image slice. Here, I_(i)(x) and L_(i-1)(x) may be functions of the image coordinates x in slice i. The labels L_(i-1)(x) may have been previously obtained for a neighboring image slice I_(i-1)(x) (not shown explicitly in FIG. 2) by way of prediction and subsequent verification and correction by the user, and may be propagated to the current image slice I_(i)(x) based on the similarity of image data in both image slices. The label propagation algorithm may for example be a ‘mechanistic’ label propagation algorithm as described in the publication of Pace et al. Here, the term ‘mechanistic’ may refer to a heuristically designed algorithm that, for example, does not rely on machine learning.
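
Purely as an illustration of the general idea of similarity-driven, patch-based label propagation (and not as a reproduction of the algorithm of Pace et al.), a strongly simplified sketch could look as follows. The function name, the parameter values and the Gaussian patch-similarity weighting are assumptions made for this example only.

```python
import numpy as np

def propagate_labels_patch_based(prev_slice, prev_labels, curr_slice,
                                 patch_radius=2, search_radius=3, sigma=0.1):
    """Return a fuzzy label map for the current slice, with values in [0, 1].

    For every pixel, a patch of the current slice is compared with patches in a
    small search window of the previous slice; the previous (binary) labels are
    averaged with weights derived from the patch similarity.
    """
    h, w = curr_slice.shape
    pad = patch_radius + search_radius
    prev_img = np.pad(prev_slice.astype(float), pad, mode="edge")
    curr_img = np.pad(curr_slice.astype(float), pad, mode="edge")
    prev_lab = np.pad(prev_labels.astype(float), pad, mode="edge")
    fuzzy = np.zeros((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            cy, cx = y + pad, x + pad
            ref = curr_img[cy - patch_radius:cy + patch_radius + 1,
                           cx - patch_radius:cx + patch_radius + 1]
            weights, labels = [], []
            for dy in range(-search_radius, search_radius + 1):
                for dx in range(-search_radius, search_radius + 1):
                    py, px = cy + dy, cx + dx
                    cand = prev_img[py - patch_radius:py + patch_radius + 1,
                                    px - patch_radius:px + patch_radius + 1]
                    ssd = np.mean((ref - cand) ** 2)
                    weights.append(np.exp(-ssd / (2.0 * sigma ** 2)))
                    labels.append(prev_lab[py, px])
            weights = np.asarray(weights)
            fuzzy[y, x] = float(np.dot(weights, labels) / (weights.sum() + 1e-12))
    return fuzzy
```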

The label propagation may provide a prediction function 220. Moreover, if the neural network is already sufficiently trained, the neural network may be applied to I_(i)(x) to provide a further prediction function 230. Both these prediction functions may for example be fuzzy label prediction functions, for example with values between 0 and 1. In the specific example of FIG. 2, these fuzzy labels may be a function ƒ_(p)(I_(i-1)(x), L_(i-1)(x), I_(i)(x)) of the input image slices I_(i-1)(x) and I_(i)(x) and labels L_(i-1)(x) for the patch-based label propagation, and a function ƒ_(n,i)(I(x)) for the neural network prediction.

Both these prediction functions 220, 230 may be combined, e.g., by weighting, into a combined prediction function 240. For example, the combination may be a convex combination of both prediction functions 220, 230:

$f_{\alpha} = f_{n}\alpha + f_{p}(1 - \alpha)$

in which the output of the neural network is weighted with a weighting factor α between 0 and 1. FIG. 2 shows a weighting with α=0.6.

The combined prediction function 240 may be converted to predicted labels L_(p,i)(x) 250, for example by applying a threshold of, e.g., 0.5 to the values of the combined prediction function 240. For example, any value below 0.5 may be assigned the label ‘background’, and any value at or above 0.5 may be assigned the label ‘foreground’, thereby obtaining the predicted labels L_(p,i)(x) 250. The predicted labels L_(p,i)(x) 250 may then be displayed to a user, who may then verify and correct the labels. The thus-obtained user-verified and corrected labels L_(i)(x) (not separately shown in FIG. 2) and image slice I_(i)(x) 210 may be used to refine the neural network by performing a few training iterations using this data. FIG. 2 also shows a difference 260 of the predicted labels L_(p,i)(x) 250 with a ground truth, which in this example is the user-verified and corrected labels.
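
Such a refinement by ‘a few training iterations’ on the newly verified slice might, purely as a sketch, look as follows, assuming a PyTorch model that maps a single-channel 2D slice to per-pixel foreground logits (e.g., a U-Net or F-Net style network). The function name, loss, optimizer, learning rate and number of steps are illustrative assumptions, not prescribed by the embodiment.

```python
import torch

def refine_network(net, image_slice, verified_labels, steps=5, lr=1e-4):
    """Perform a few training iterations on one verified slice (illustrative)."""
    # Shape (1, 1, H, W): batch and channel dimensions added to the 2D arrays.
    x = torch.as_tensor(image_slice, dtype=torch.float32)[None, None]
    y = torch.as_tensor(verified_labels, dtype=torch.float32)[None, None]
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    net.train()
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(net(x), y)  # per-pixel foreground logits vs. labels
        loss.backward()
        optimizer.step()
    return net
```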

Alternatively, the user-verified and corrected labels of a number of image slices, e.g., all image slices of the current image, may be used together to retrain the neural network, e.g., in case the user-verified and corrected labels of a single image slice do not represent sufficient training data for retraining.

By way of example, FIG. 3 shows further prediction functions of a label propagation algorithm 300 (left column), a neural network 301 (middle column) and a 60%-40% weighted combination of both prediction functions 302 (right column), as well as the labels 310-312 which are generated based on the respective prediction functions 300-302 and their differences 320-322 with respect to a ground truth labeling. It can be seen that the prediction function 300 and labels 310 of the label propagation algorithm provide an ‘under-segmentation’ of the anatomical structure. By combining the prediction function 300 with the prediction function 301 of the neural network, an improved annotation of the anatomical structure is obtained, yielding a reduced difference 322 with respect to the ground truth labeling.

As neural network and training technique, any suitable neural network and training technique may be used, including but not limited to those described by “U-net: Convolutional networks for biomedical image segmentation” by Ronneberger, O.; Fischer, P.; Brox, T., MICCAI 2015, LNCS 9351, pp. 234-241, which is henceforth also simply referred to as ‘U-Net’, or “Foveal fully convolutional nets for multi-organ segmentation” by Brosch, T.; Saalbach, A., SPIE Medical Imaging 2018; SPIE Proc. 105740U, henceforth also simply referred to as ‘F-Net’.

In general, there are many variants for training the neural network. For example, the neural network may be retrained on only the labels obtained in the current iteration, e.g., the user-verified and corrected version of L_(p,i)(x) 250, and the associated image data part. Such retraining may be a partial retraining, which may also be referred to as ‘refinement’ of the neural network. Alternatively, the neural network may be retrained on all previous user-verified and corrected labels and associated image data parts. Such retraining may for example be performed after an interactive annotation session, e.g., in between different annotation sessions, or at night, etc. In some embodiments, the retraining may be performed by another entity than the entity providing the iterative annotation functionality. For example, the system may be represented by a workstation for providing the iterative annotation functionality and a server for retraining the neural network.

There are many ways to generate the weight for combining the outputs of the label propagation algorithm and the neural network. For example, before retraining the neural network using the current user-verified and corrected labels, the previous labels may be used to estimate the weight. For that purpose, annotated slices may be used as starting point and the annotation may be propagated to the next image slice using the label propagation:

$f_{p,i}(x) = f_{p}(I_{i-1}(x), L_{i-1}(x), I_{i}(x))$

resulting in a fuzzy label map with elements 0 ≤ ƒ_(p,i)(x) ≤ 1. I_(i)(x) denotes slice i of the image I currently being annotated, ƒ_(p,i)(x) denotes the fuzzy label image, and p denotes the label propagation algorithm. By applying a threshold t, the fuzzy label image may be converted into a label image:

$L_{i}^{t}(x) = T(f_{p,i}(x), t)$

Using the weight α, the propagated fuzzy label image ƒ_(p,i)(x) may be combined with the fuzzy label image ƒ_(n,i)(x) generated with the neural network. The labels may be obtained by application of a threshold t, and a metric M that characterizes the annotation accuracy, such as the DICE value (also referred to as DICE coefficient) described in “Measures of the amount of ecologic association between species” by Dice, L. R., Ecology, 1945, volume 26(3), pp. 297-302, may be computed. The metric M may be summed over all image slices with user-verified and corrected labels, and an optimal value for the weight and the threshold may be obtained by maximizing the annotation accuracy:

$(\alpha_{opt}, t_{opt}) = \arg\max_{\alpha, t} \sum_{i} M\left( T\left( \alpha f_{n,i}(x) + (1 - \alpha) f_{p,i}(x),\, t \right), L_{i}(x) \right)$
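
One straightforward way to evaluate this maximization is a brute-force grid search over α and t, as sketched below for a list of already-annotated slices. The `metric` argument could, for example, be the DICE coefficient sketched earlier; the grid ranges and resolutions are illustrative assumptions.

```python
import numpy as np

def optimize_weight_and_threshold(f_n_slices, f_p_slices, verified_labels, metric,
                                  alphas=np.linspace(0.0, 1.0, 21),
                                  thresholds=np.linspace(0.1, 0.9, 17)):
    """Grid search for (alpha_opt, t_opt) maximizing the summed metric M."""
    best_score, best_alpha, best_t = -np.inf, None, None
    for alpha in alphas:
        for t in thresholds:
            score = 0.0
            for f_n, f_p, labels in zip(f_n_slices, f_p_slices, verified_labels):
                # Combined fuzzy prediction, thresholded into a label image.
                combined = alpha * f_n + (1.0 - alpha) * f_p
                score += metric((combined >= t).astype(np.uint8), labels)
            if score > best_score:
                best_score, best_alpha, best_t = score, alpha, t
    return best_alpha, best_t
```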

There are also many possibilities to combine the outputs of the label propagation algorithm and the neural network. For example, after having completed the annotation of a first and second slice of a new image, the weight α and threshold t may be optimized according to the above-described procedure to yield optimum results for the actual image at hand. This approach makes the neural network somewhat aware of its own performance: if the neural network encounters a familiar case, its weighting will increase, while in unfamiliar cases, its weighting will decrease and the labels are mainly obtained from the label propagation algorithm.

Instead of determining an optimal weight from previously annotated image data parts (e.g., previous image slices) or previously annotated image data (e.g., previous images), an empirical weight might be used that depends only or primarily on the number of previously annotated image data sets m, for example:

$\alpha_{m} = {e^{\frac{m}{20}}.}$

Alternatively, an empirical weight may be made dependent on the number of previously annotated image data parts, e.g., image slices.

In general, the weight may be chosen in dependence on previous values of the metric M (e.g., an average metric for previous examples) obtained by label propagation (M^(P)) and the neural network (M^(N)). The weight α may be chosen to give a larger weight to the algorithm that provides the better segmentation result.
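
One simple, hypothetical way of deriving such a weight from the average previous metric values is to normalize them against each other, as sketched below; this particular normalization is an assumption made for illustration, not a prescribed rule.

```python
def weight_from_metrics(mean_metric_propagation, mean_metric_network):
    """Give a larger weight to the algorithm with the better average metric."""
    total = mean_metric_propagation + mean_metric_network
    if total == 0:
        return 0.5  # no evidence either way: weight both outputs equally
    return mean_metric_network / total  # weight of the neural network output
```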

With respect to the fuzzy prediction functions, the following is noted. A fuzzy prediction function contains an implicit measure of certainty: if the value is close to 0 or 1, it may be relatively certain that the derived labels are correct; however, if the value is closer to the threshold level, e.g., of 0.5, this certainty is low. This property may be used for locally varying the mixing weight of the label propagation and the neural network prediction functions, which may be interpreted as a ‘validity’ of the decision function or a probability that the decision function is valid.

For example, validity functions may be defined as follows:

$v_{p} = |2 f_{p} - 1|, \quad v_{n} = |2 f_{n} - 1|$

The validity functions may be used as a local combination weighting:

$f_{vp} = f_{p} v_{p} + f_{n}(1 - v_{p}), \quad f_{vn} = f_{n} v_{n} + f_{p}(1 - v_{n})$

An example of validity functions is shown in FIG. 4 for an image slice 400, a ground truth slice labeling 410, a prediction function ƒ_(p) 420 provided by the label propagation algorithm, a prediction function ƒ_(n) 430 provided by the neural network, and the validity function v_(p) 425 and the validity function v_(n) 435.

The validity functions can be used and combined in various ways to construct the fuzzy label image. For example, both validity-weighted fuzzy label images are combined by a weight β:

$f_{\beta} = f_{vn}\beta + f_{vp}(1 - \beta)$

and the resulting function ƒ_(β) is combined with ƒ_(α) by a weight γ to obtain the final fuzzy label image ƒ_(γ):

$f_{\gamma} = f_{\beta}\gamma + f_{\alpha}(1 - \gamma)$

with both β and γ between 0 and 1. Roughly speaking, α and β represent the contribution of the neural network relative to the label propagation, and γ the contribution of the validity functions relative to the weighting function.
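
Taken together, the validity-based local weighting and the global weights may be sketched as follows, assuming all prediction functions are NumPy arrays of equal shape; the function name and the default values of β and γ are illustrative choices.

```python
import numpy as np

def validity_weighted_combination(f_p, f_n, f_alpha, beta=0.5, gamma=0.5):
    """Combine prediction functions using validity-based local weights."""
    # Validity functions: high where the fuzzy prediction is close to 0 or 1.
    v_p = np.abs(2.0 * f_p - 1.0)
    v_n = np.abs(2.0 * f_n - 1.0)
    # Locally (per-pixel) validity-weighted fuzzy label images.
    f_vp = f_p * v_p + f_n * (1.0 - v_p)
    f_vn = f_n * v_n + f_p * (1.0 - v_n)
    # Combine the two validity-weighted images by the weight beta ...
    f_beta = f_vn * beta + f_vp * (1.0 - beta)
    # ... and combine the result with the globally weighted image f_alpha.
    return f_beta * gamma + f_alpha * (1.0 - gamma)
```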

Suitable mixing coefficients α, β, and γ may be obtained by the same optimization process as described for the parameter optimization of the label propagation. Since the neural network prediction function ƒ_(n) does not change during the optimization, it may be implemented as a lookup function. If all label propagation parameters were constant during the optimization of the mixing coefficients, it would also be sufficient to calculate the label propagation prediction function ƒ_(p) once and use a lookup.

FIG. 5 shows a block diagram of a computer-implemented method 500 for annotation of image data. The method 500 may correspond to an operation of the system 100 of FIG. 1. However, this is not a limitation, in that the method 500 may also be performed using another system, apparatus or device.

The method 500 may comprise, in an operation titled “ACCESSING IMAGE DATA TO BE ANNOTATED”, accessing 510 the image data to be annotated. The method 500 may further comprise, using a user interface, enabling a user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises, in an operation titled “GENERATING LABELS”, generating 520 labels for a current image data part based on user-verified labels of a previous image data part, and in an operation titled “ENABLING USER TO VERIFY AND CORRECT LABELS”, via the user interface, enabling 530 the user to verify and correct said generated labels to obtain user-verified labels for the current image data part. The method 500 may further comprise, in an operation titled “RETRAINING NEURAL NETWORK”, retraining 540 the neural network using the user-verified labels and the image data of the current image data part to obtain a retrained neural network.

In general, the operations of method 500 of FIG. 5 may be performed in any suitable order, e.g., consecutively, simultaneously, or a combination thereof, subject to, where applicable, a particular order being necessitated, e.g., by input/output relations. The operations of the method may be repeated in each subsequent iteration, as illustrated by the loop in FIG. 5 from block 540 to block 510. In some embodiments, the block 540 may be performed after several iterations of blocks 510-530, e.g., after all image parts of the image data have been annotated.

The method(s) may be implemented on a computer as a computer-implemented method, as dedicated hardware, or as a combination of both. As also illustrated in FIG. 6, instructions for the computer, e.g., executable code, may be stored on a computer-readable medium 600, e.g., in the form of a series 610 of machine-readable physical marks and/or as a series of elements having different electrical, e.g., magnetic, or optical properties or values. The executable code may be stored in a transitory or non-transitory manner. Examples of computer-readable mediums include memory devices, optical storage devices, integrated circuits, servers, online software, etc. FIG. 6 shows an optical disc 600.

In accordance with an abstract of the present application, a system and computer-implemented method may be provided for annotation of image data. A user may be enabled to iteratively annotate the image data. An iteration of said iterative annotation may comprise generating labels for a current image data part based on user-verified labels of a previous image data part, and enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part. The labels for the current image data part may be generated by combining respective outputs of a label propagation algorithm, and a neural network trained on user-verified labels and image data and applied to image data of the current image data part. The neural network may be retrained using the user-verified labels and the image data of the current image data part to obtain a retrained neural network.

Examples, embodiments or optional features, whether indicated as non-limiting or not, are not to be understood as limiting the invention as claimed.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb “comprise” and its conjugations does not exclude the presence of elements or stages other than those stated in a claim. The article “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. Expressions such as “at least one of” when preceding a list or group of elements represent a selection of all or of any subset of elements from the list or group. For example, the expression, “at least one of A, B, and C” should be understood as including only A, only B, only C, both A and B, both A and C, both B and C, or all of A, B, and C. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

The invention claimed is:
1. A system for annotation of image data, the system comprising: an input interface configured to access the image data to be annotated; a user interface subsystem comprising: a user input interface configured to receive user input data from a user input device operable by a user; a display output interface configured to provide display data to a display to visualize output of the system; a processor configured to, using the user interface subsystem, establish a user interface which enables the user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises: the processor generating labels for a current image data part based on user-verified labels of a previous image data part; via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part; wherein the processor is further configured to: generate the labels for the current image data part by combining, by weighting, respective outputs of: a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and a machine-learned classifier for labeling of image data, wherein the machine-learned classifier is trained on user-verified labels and image data and applied to image data of the current image data part; and retrain the machine-learned classifier using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.
2. The system according to claim 1, wherein the processor is configured to retrain the machine-learned classifier between iterations of the iterative annotation, or between the iterative annotation of different image data.
3. The system according to claim 1, wherein the processor is configured to adjust the weighting during the iterative annotation of the image data, or between the iterative annotation of different image data.
4. The system according to claim 1, wherein the processor is configured to determine the weighting based on a metric quantifying an annotation accuracy of the label propagation algorithm and/or the machine-learned classifier.
5. The system according to claim 4, wherein the metric quantifies the annotation accuracy based on a difference between i) the output of the label propagation algorithm and/or the output of the machine-learned classifier and ii) the user-corrected labels.
6. The system according to claim 3, wherein the processor is configured to adjust the weighting by increasing a weighting of the output of the machine-learned classifier relative to the output of the label propagation algorithm.
7. The system according to claim 6, wherein the processor is configured to start the weighting of the output of the machine-learned classifier at or substantially at zero at a start of the iterative annotation.
8. The system according to claim 1, wherein the weighting comprises a global weighting per image data part and/or a local weighting per pixel, voxel or other image sub-region.
9. The system according to claim 1, wherein the output of the label propagation algorithm and/or the output of the machine-learned classifier is a probability map, or one or more control points defining a contour.
10. The system according to claim 1, wherein the label propagation algorithm is configured to propagate the user-verified labels of the previous image data part to the current image data part based on a similarity in image data between the previous image data part and the current image data part.
11. The system according to claim 10, wherein the label propagation algorithm is a patch-based label propagation algorithm.
12. A workstation or imaging apparatus comprising the system according to claim 1.
13. A computer-implemented method for annotation of image data, the method comprising: accessing the image data to be annotated; using a user interface, enabling a user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises: generating labels for a current image data part based on user-verified labels of a previous image data part; via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part; wherein the labels for the current image data part are generated by combining, by weighting, respective outputs of: a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and a machine-learned classifier for labeling of image data, wherein the machine-learned classifier is trained on user-verified labels and image data and applied to image data of the current image data part; and retraining the machine-learned classifier using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.
14. A non-transitory computer readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to: access image data to be annotated; using a user interface, enable a user to iteratively annotate the image data, wherein an iteration of said iterative annotation comprises: generating labels for a current image data part based on user-verified labels of a previous image data part; via the user interface, enabling the user to verify and correct said generated labels to obtain user-verified labels for the current image data part; wherein the labels for the current image data part are generated by combining, by weighting, respective outputs of: a label propagation algorithm which propagates the user-verified labels of the previous image data part to the current image data part, and a machine-learned classifier for labeling of image data, wherein the machine-learned classifier is trained on user-verified labels and image data and applied to image data of the current image data part; and retraining the machine-learned classifier using the user-verified labels and the image data of the current image data part to obtain a retrained machine-learned classifier.
15. The non-transitory computer readable medium of claim 14, wherein the machine-learned classifier is retrained between iterations of the iterative annotation, or between the iterative annotation of different image data.
16. The non-transitory computer readable medium of claim 14, wherein weighting is adjusted during the iterative annotation of the image data or between the iterative annotation of different image data.
17. The non-transitory computer readable medium of claim 14, wherein the weighting is determined based on a metric quantifying an annotation accuracy of the label propagation algorithm and/or the machine-learned classifier.
18. The non-transitory computer readable medium of claim 17, wherein the metric quantifies the annotation accuracy based on a difference between i) the output of the label propagation algorithm and/or the output of the machine-learned classifier and ii) the user-corrected labels.
19. The non-transitory computer readable medium of claim 16, wherein the weighting is adjusted by increasing a weighting of the output of the machine-learned classifier relative to the output of the label propagation algorithm.
20. The non-transitory computer readable medium of claim 19, wherein the weighting of the output of the machine-learned classifier is started at or substantially at zero at a start of the iterative annotation.